The 2019-2020 coronavirus pandemic will be one of the most influential health crises of the 21st century. COVID-19, the disease responsible for the pandemic, is caused by a new strain of coronavirus and leads to respiratory distress and disease. The economic impact of the COVID-19 pandemic will continue to be felt for months after the virus has gone. Over the past several weeks, global share markets have dropped, businesses have closed, and governments have introduced strict social distancing and lockdown laws.
In this project, the impact of government responses and indicators on the coronavirus pandemic is explored. The visualisations explore fundamental trends, the different lockdown measures that have been introduced and how government indicators have influenced the pandemic.
Below are links to Jupyter Notebooks containing the explanations for the visualisations presented in this website, the datasets utilised and the original code.
Before you continue to the visualisations, let us explain how the website works. In the bottom right-hand corner of the website, there is a set of arrows for navigating through the website. Alternatively, you can use the arrow keys on your keyboard. Each visualisation has three parts: the introduction, the visualisation and the conclusion. Use the right arrow to move between visualisations and the down arrow to move between the introduction, visualisation and conclusion. You can use the up and left arrows at any time to go back to previous sections of the website.
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import folium
import sklearn as sk
import sklearn.preprocessing as prep
import pydotplus
import math
import bokeh
import plotly
import plotly.offline as offline
import plotly.express as px
import plotly.io as pio
import matplotlib.cm as cm
from matplotlib.colors import Normalize
from numpy.random import rand
from IPython.display import HTML
from scipy import stats
from folium import plugins
from folium.plugins import HeatMapWithTime, MarkerCluster, HeatMap
from bokeh.models import ColumnDataSource, FactorRange, Legend, HoverTool, LinearColorMapper, CDSView, GroupFilter, CustomJS, Div, ColorBar
from bokeh.plotting import figure
from bokeh.transform import transform, factor_cmap
from bokeh.io import output_notebook, show
from bokeh.layouts import row, column, gridplot, layout
from bokeh.palettes import Category20_14, plasma, viridis
from IPython.display import Image
from sklearn.tree import export_graphviz
from io import StringIO  # sklearn.externals.six is no longer shipped with recent scikit-learn releases
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from scipy.signal import find_peaks
Set up Bokeh for Jupyter notebooks:
output_notebook()
Set pandas view settings to make it easier to inspect dataframes:
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
Set global matplotlib plot style to make it nicer:
sns.set(style='ticks', palette='muted', color_codes=True)
Set up uniform styling constants:
fontsize_labels_px = "14px"
fontsize_titles_px = "16px"
fontsize_titles = 16
fontsize_labels = 14
plot_width = 760
plot_height = 600
cases_data = pd.read_csv("../datasets/cases_by_day.csv", index_col=0)
deaths_data = pd.read_csv("../datasets/deaths_by_day.csv", index_col=0)
recovered_data = pd.read_csv("../datasets/recovered_by_day.csv", index_col=0)
responses_data = pd.read_csv("../datasets/corona_policies_cleaned.csv",
index_col=0)
worldbank_data = pd.read_csv("../datasets/worldbank_and_press_freedom.csv",
index_col=1)
oxford_new_data = pd.read_csv("../datasets/oxford_new_cleaned.csv",
index_col=0)
Rename the first column of the worldbank dataset so it can be addressed by name:
worldbank_data = worldbank_data.rename(
columns={worldbank_data.columns[0]: "country"})
Remove the duplicated India row:
worldbank_data = worldbank_data.drop_duplicates(subset=['country'])
Only keep interesting columns of the government-responses dataset:
responses_relevant = responses_data.drop(columns=[
'ADMIN_LEVEL_NAME', 'PCODE', 'LOG_TYPE', 'NON_COMPLIANCE', 'SOURCE',
'SOURCE_TYPE', 'LINK', 'Alternative source', 'ENTRY_DATE'
])
Remove rows that don't contain the date of implementation:
# copy so that the date column can be overwritten below without a SettingWithCopyWarning
responses_notna = responses_relevant.dropna(subset=['DATE_IMPLEMENTED']).copy()
Parse the dates of implementation:
responses_notna['DATE_IMPLEMENTED'] = pd.to_datetime(
responses_notna['DATE_IMPLEMENTED'], format='%Y-%m-%d')
Oxford dataset (new format): parse the dates
oxford_new_data['Date'] = pd.to_datetime(oxford_new_data.Date, format="%Y%m%d")
... and only keep the dates that match those in the other datasets:
oxford_new_data = oxford_new_data[oxford_new_data.Date >= '2020-01-22']
Remove data for countries that have fewer than 100 total cases, as they cannot be considered significant:
non_significative = [c[0] for c in cases_data.iterrows() if c[1].max() < 100]
significative_cases_data = cases_data.drop(non_significative)
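The same filter can also be written without an explicit loop. The following is only a minimal sketch on made-up numbers: the toy frame and the names toy_cases and significant are assumptions for illustration and are not part of the project data.

```python
import pandas as pd

# Toy cumulative case counts: rows are ISO codes, columns are days (made-up values).
toy_cases = pd.DataFrame(
    {"day_1": [0, 5, 40], "day_2": [3, 60, 80], "day_3": [9, 150, 99]},
    index=["AAA", "BBB", "CCC"],
)

# Keep the countries whose largest value reaches at least 100,
# i.e. drop those with fewer than 100 cases overall.
significant = toy_cases.loc[toy_cases.max(axis=1) >= 100]
print(list(significant.index))
```

Only the row that reaches 100 cases at some point survives the filter, which mirrors what the loop above does with the real data.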
Parse dates for all of the COVID-19 data:
significative_cases_data.columns = pd.to_datetime(
significative_cases_data.columns)
cases_data.columns = pd.to_datetime(cases_data.columns)
deaths_data.columns = pd.to_datetime(deaths_data.columns)
recovered_data.columns = pd.to_datetime(recovered_data.columns)
The first visualisation shows an overview of the number of COVID-19 cases in all countries on a given date. The slider at the bottom of the plot can be used to select the date. Dragging the slider shows the development of the pandemic. Hovering over a country shows the total number of cases on the chosen date.
plotly.offline.init_notebook_mode()
Create logarithmic colour scale:
colorscale= [
[0, '#ffffff'],
[1./10000, '#c3ffc1'],
[1./1000, '#f5ffa0'],
[1./100, '#ffe98f'],
[1./10, '#ffc277'],
[1., '#ff8383']]
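To see why this colour scale behaves logarithmically, it helps to translate the stops into case counts. Each stop is a fraction of the colour-bar maximum, so consecutive stops are a factor of ten apart. The sketch below assumes a hypothetical maximum of one million cases (the real zmax is computed from the data further down).

```python
# Fractions of the colour-bar maximum used as stops in the colour scale above.
stops = [0, 1 / 10000, 1 / 1000, 1 / 100, 1 / 10, 1]

# Hypothetical maximum number of cases, chosen only for illustration.
zmax = 1_000_000

# Number of cases at which each colour stop is reached.
thresholds = [round(fraction * zmax) for fraction in stops]
print(thresholds)
```

With this assumed maximum, the colour changes at 100, 1,000, 10,000 and 100,000 cases, so countries with few cases remain distinguishable from each other.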
Create the slider data. Each entry corresponds to one day. z is the variable to be plotted, in our case the cumulative number of cases on that day. The hover text shows the number of deaths, the number of recovered patients and the mortality rate.
data_slider = []
i = 1
for col in cases_data.columns:
    if i == len(cases_data.columns) - 1:
        break
    data = dict(
        type='choropleth',
        colorscale=colorscale,
        autocolorscale=False,
        locations=cases_data.index,
        z=cases_data.iloc[:, i].astype(int),
        zmax=int(cases_data.iloc[:, 1:-1].max().max()),
        zmin=0,
        locationmode='ISO-3',
        text="Deaths: " + deaths_data.iloc[:, i].astype(int).astype(str) +
        " Recovered: " + recovered_data.iloc[:, i].astype(int).astype(str) +
        " Mortality: " + (round(deaths_data.iloc[:, i].astype(int) /
                                cases_data.iloc[:, i].astype(int) * 100, 2)).astype(str) + "%",
        marker=dict(
            line=dict(
                color='rgb(255,255,255)',
                width=0.5)),
        colorbar=dict(
            title="Number of cases")
    )
    i = i + 1
    data_slider.append(data)
Format the dates for the slider:
dates = []
for col in cases_data:
    dates.append(col.strftime('%d-%b'))
Create steps for the slider:
steps = []
for i in range(len(data_slider)):
    # data_slider[i] holds the data of column i + 1, so label it with the matching date
    step = dict(method='restyle',
                args=['visible', [False] * len(data_slider)],
                label=dates[i + 1])
    step['args'][1][i] = True
    steps.append(step)
sliders = [dict(active=0, pad={"t": 1}, steps=steps)]
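The slider works by toggling trace visibility: every step carries a list with one boolean per trace, and only the trace for the selected day is set to True. A minimal sketch of that mechanism, assuming just three hypothetical daily traces and placeholder labels:

```python
n_frames = 3  # hypothetical number of daily traces

steps = []
for i in range(n_frames):
    visible = [False] * n_frames
    visible[i] = True  # show only the trace for day i
    steps.append({
        "method": "restyle",
        "args": ["visible", visible],
        "label": f"day {i}",  # placeholder label
    })

# One visibility mask per slider position.
masks = [step["args"][1] for step in steps]
print(masks)
```

Because each step needs its own full-length mask, the size of the figure grows quadratically with the number of days, which is one reason the resulting HTML file becomes large.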
Create the figure layout:
layout = dict(
title= " Overview of the COVID-19 pandemic",
geo = dict(projection={'type':'natural earth'}),
sliders=sliders)
fig = dict(data=data_slider, layout=layout)
Save the figure to an HTML file:
pio.write_html(fig, "fig_1.html")
Unfortunately, the interactive slider plot we originally made is too heavy to include on the website. We have instead created a GIF to show the visualisation. If you would like to view the original visualisation, please see here: https://ldorigo.github.io/visualisation_project/fig_1.html
HTML('<img src="./fig_1.gif" width="750" align="center">')
The general trend shows the outbreak initially confined to China, before spreading to Europe and North America. From the visualisation, we can see that the number of cases increased fastest during the last month shown. The plot also shows that while some countries have over 1 million cases, others, for example countries in Africa, have so far recorded comparatively few infections. What is now interesting to visualise is the overall statistics of the number of cases.
The following plot shows the general evolution of the number of confirmed COVID-19 cases, COVID-19-related deaths and recovered patients. Each category is plotted twice: once on a linear axis, where the exponential growth is visible (titled "Exponential"), and once on a logarithmic axis (titled "Logarithmic").
Create a list of x tick labels for the plot:
x_ticks = []
xticklabels = []
i = 0
for col in cases_data:
    # keep one tick every 15 days
    if i % 15 == 0:
        x_ticks.append(col)
        xticklabels.append(col.strftime("%d/%m"))
    i = i + 1
Initialise the figure and different subplots:
fig, axs = plt.subplots(3,
2,
figsize=(12, 8),
dpi=100,
sharex='col',
gridspec_kw={
'hspace': 0.2,
'wspace': 0.15
});
(ax1, ax2), (ax3, ax4), (ax5, ax6) = axs;
Plot each of the subplots. The .sum() function adds up the cases of all countries for each date in the dataset, which gives the worldwide totals to be plotted:
ax1.plot(cases_data.columns, cases_data.sum(), 'tab:orange')
ax2.plot(cases_data.columns, cases_data.sum(), 'tab:orange')
ax3.plot(deaths_data.columns, deaths_data.sum(), 'tab:red')
ax4.plot(deaths_data.columns, deaths_data.sum(), 'tab:red')
ax5.plot(recovered_data.columns, recovered_data.sum(), 'tab:green')
ax6.plot(recovered_data.columns, recovered_data.sum(), 'tab:green')
Add titles to the plot:
fig.suptitle('COVID-19 Cases Summary', fontsize=24)
ax1.set(title="Confirmed Cases - Exponential")
ax2.set(title="Confirmed Cases - Logarithmic", yscale="log")
ax3.set(title="Total Deaths - Exponential")
ax4.set(title="Total Deaths - Logarithmic", yscale="log")
ax5.set(title="Recovered Patients - Exponential")
ax6.set(title="Recovered Patients - Logarithmic", yscale="log")
Set the font size of the subplot titles:
ax1.title.set_size(18)
ax2.title.set_size(18)
ax3.title.set_size(18)
ax4.title.set_size(18)
ax5.title.set_size(18)
ax6.title.set_size(18)
Add the tick labels:
for ax in axs.flat:
    ax.set_xticks(x_ticks)
    ax.set_xticklabels(xticklabels)
    # ax.tick_params('x', rotation = 40)
    ax.tick_params('y', labelsize='small')
    # tick.label1 is the primary tick label; the older tick.label alias is deprecated
    for tick in ax.xaxis.get_major_ticks():
        tick.label1.set_fontsize(12)
    for tick in ax.yaxis.get_major_ticks():
        tick.label1.set_fontsize(12)
Add two main axis labels:
fig.text(0.035,
0.5,
"Cases",
verticalalignment="center",
rotation=90,
fontsize=16)
fig.text(0.5,
0.035,
"Date",
horizontalalignment="center",
rotation=0,
fontsize=16)
display(fig)
The cases summary visualisation shows the exponential and logarithmic growth of each category of COVID-19 case. When looking at the exponential plots, we can see that the steep increase in the number of cases began around the 22nd of March 2020. This was less than two weeks after the World Health Organisation declared the pandemic on the 11th of March 2020. In the logarithmic plots, by contrast, the number of cases appears to level off around the 21st of February, as if a carrying capacity (an observed maximum number of cases) had been reached. However, this was not the true carrying capacity: the pandemic continued to spread after this date, which is where the second increase in cases is observed.
An epidemic or pandemic curve is a statistical chart used to visualise the evolution of a disease or virus outbreak. It can be used to map the different stages of a pandemic and to determine when each stage occurs. The term "flatten the curve" has been heard continuously over the last few months of the COVID-19 pandemic, and the curve it refers to is the pandemic curve. Authorities want to bring the peak of the curve down so that health care systems worldwide are not overwhelmed. For this visualisation, only specific countries have been analysed: Australia, China, Denmark, France, India, Iran, Italy, Mexico, Sweden and the USA. An important point to note is that the pandemic curve represents the number of active cases on a given day, that is, it excludes recovered patients and deaths. The data is also shown relative to the population of each country, so you can see the fraction of the population that is infected and compare it with other countries. You may choose the countries you wish to visualise by clicking the country abbreviations on the left-hand side of the visualisation.
To see the true shape of the pandemic curve, only the number of active cases per day is visualised. In its original format the dataset is cumulative, which is not suitable if we only want to show the number of active cases per day. Therefore, the deaths and recovered datasets need to be subtracted from the cases dataset.
# calculate the daily number of active cases
daily_cases_sub_recovered = cases_data.subtract(recovered_data)
daily_cases_data = daily_cases_sub_recovered.subtract(deaths_data)
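As a small worked example of this subtraction, consider one hypothetical country over three days. All numbers and the toy_ names below are made up for illustration.

```python
import pandas as pd

days = ["2020-03-01", "2020-03-02", "2020-03-03"]

# Made-up cumulative counts for a single hypothetical country "AAA".
toy_cases = pd.DataFrame([[10, 25, 40]], index=["AAA"], columns=days)
toy_recovered = pd.DataFrame([[0, 5, 12]], index=["AAA"], columns=days)
toy_deaths = pd.DataFrame([[0, 1, 3]], index=["AAA"], columns=days)

# Active cases = cumulative cases - cumulative recovered - cumulative deaths.
toy_active = toy_cases - toy_recovered - toy_deaths
print(toy_active.loc["AAA"].tolist())
```

On the third day, 40 people have been infected in total, but 12 have recovered and 3 have died, leaving 25 active cases.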
Due to a large number of countries in the datasets, for the pandemic curve plot, only specific focus countries were analyzed. These included Australia, China, Denmark, France, India, Iran, Italy, Mexico, Sweden and the USA.
# initialise focus countries (a list, so that the order is fixed and it can be used with .loc)
focus_countries = [
    "AUS", "CHN", "DNK", "FRA", "IND", "IRN", "ITA", "MEX", "SWE", "USA"
]
# separate the focus countries from the data
data_focused = daily_cases_data.loc[focus_countries]
data_focused = data_focused.T
pops = worldbank_data.loc[focus_countries, 'population']
norm_data = data_focused / pops
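The division above relies on pandas aligning the column labels of the data frame (country codes) with the index of the population series. A minimal sketch with two hypothetical countries and made-up numbers:

```python
import pandas as pd

# Rows are dates, columns are (hypothetical) country codes.
toy_active = pd.DataFrame(
    {"AAA": [100, 200], "BBB": [50, 10]},
    index=pd.to_datetime(["2020-03-01", "2020-03-02"]),
)

# Made-up populations, indexed by the same country codes.
toy_pops = pd.Series({"AAA": 1_000_000, "BBB": 10_000})

# Each column is divided by the population with the matching label.
toy_share = toy_active / toy_pops
print(toy_share)
```

Although "AAA" has more active cases in absolute terms, "BBB" has the larger infected share of its population, which is exactly the kind of comparison the normalised plot makes possible.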
Convert the Pandas Dataframe to Bokeh ColumnDataSource:
source = ColumnDataSource(norm_data)
Create the Bokeh plot:
# create a new figure, specify the axis type, width and title
p = figure(
    x_axis_type='datetime',
    width=plot_width,
    title="Pandemic Curve",
)
# to store vbars
bar = {}
# loop through each of the focus_countries
for i, country in enumerate(focus_countries):
    # create a vbar for each of the focus_countries
    bar[country] = p.vbar(
        # on a datetime axis the bar width is given in milliseconds (about 14 hours here)
        width=50000000,
        alpha=0.5,
        x="index",
        top=country,
        source=source,
        muted_alpha=0.05,
        muted=True,
        color=Category20_14[i])
# add a legend to the plot
legend_items = [(i, [bar[i]]) for i in focus_countries]
legend = Legend(items=legend_items, location=(0, 110), click_policy="mute")
p.add_layout(legend, 'left')
# add trimmings to the plot and adjust font sizes
p.xaxis.axis_label = "Date"
p.yaxis.axis_label = "Relative Frequency"
p.title.text_font_size = '24pt'
p.xaxis.axis_label_text_font_size = '16pt'
p.yaxis.axis_label_text_font_size = '16pt'
p.xaxis.major_label_orientation = math.pi / 3
Show the plot:
show(p)
From this visualisation, we can see that the countries are at different stages of the pandemic. Some countries, for example Australia and China, have successfully flattened the curve, in contrast to countries like Italy and the USA, where a much larger share of the population has been infected with the virus. From the plot, we can see that the virus has been contained in China and Australia, and that these countries are entering the final stages of the pandemic, as signified by the dropping off of their pandemic curves. By comparison, for countries like the USA, Sweden and France, the pandemic is still in its early stages, as the number of cases continues to climb. For some countries, the pandemic curve follows a clear bell shape, while for others, like France and Denmark, the curve is more jagged. These trends are related to the types of government measures that have been introduced, which are explored in the following visualisation.
To get an overview of how different countries reacted to the outbreak, we have assembled a visualisation that shows the specific measures implemented by each country, plotted by the time since the outbreak and the number of cases at the time of implementation. You can select different categories and obtain more information on specific measures by hovering over them.
To be able to show the time it took for each government to implement specific policies, we first need to compute the start date of the pandemic in each country. We set this to be the first day on which the country has more than 50 cases: although arbitrary, this number is a good compromise between not leaving too many countries out (i.e., countries with very few cases) and not having too much noise (as countries with extremely few cases are often outliers and make the data messy).
We start by computing the number of days before the start of the outbreak in each country:
firstdays = []
firstdays_dict = {}
for _row in significative_cases_data.iterrows():
    for index, val in enumerate(_row[1]):
        if val > 50:
            firstdays.append(int(index))
            firstdays_dict[_row[0]] = int(index)
            break
        if index == len(_row[1]) - 1:
            firstdays_dict[_row[0]] = len(_row[1])
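The search for the first day above the threshold can also be expressed with array operations. The sketch below is an alternative formulation on made-up data, not the code used for the plots; the toy frame and variable names are assumptions.

```python
import pandas as pd

# Made-up cumulative cases: one row per country, one column per day (0, 1, 2, 3).
toy_cases = pd.DataFrame(
    [[0, 10, 60, 120], [20, 55, 90, 300], [0, 1, 2, 3]],
    index=["AAA", "BBB", "CCC"],
)

above = toy_cases > 50

# argmax returns the position of the first True in each row...
first_idx = above.to_numpy().argmax(axis=1)
# ...but also returns 0 when a row has no True at all, so track those rows separately.
reached = above.any(axis=1).to_numpy()

first_days = {
    iso: int(pos)
    for iso, pos, ok in zip(toy_cases.index, first_idx, reached)
    if ok
}
print(first_days)
```

The hypothetical country "CCC" never exceeds 50 cases and is therefore left out of the result.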
Start building a dataframe with the data we need:
df_responses_in_time = pd.DataFrame(
index=significative_cases_data.index
) # these are the ISO codes for the countries we're interested in
df_responses_in_time = df_responses_in_time.assign(
days_before_first_case=firstdays)
To make it easier to compute time intervals, we include a column with the (constant) date on which the case data starts:
# an unambiguous ISO date avoids day/month confusion when parsing
df_responses_in_time = df_responses_in_time.assign(
    initial_day=pd.Timestamp('2020-01-22'))
We then use the number of days before the start of the outbreak to make a column containing the start date in each country:
# Add column with the date of the first case
initial_dates = []
for _row in df_responses_in_time.iterrows():
    # print("row: {}".format(row))
    initial_dates.append(_row[1].initial_day +
                         pd.DateOffset(days=_row[1].days_before_first_case))
df_responses_in_time = df_responses_in_time.assign(
    first_case_date=initial_dates)
Drop rows that contain NaNs. This makes it easier to plot, and while we lose some data, we're only interested in trends, so it shouldn't matter:
df_responses_in_time = df_responses_in_time.dropna()
We now need to merge (join) the "government measures" data to our dataframe:
cases_and_measures_df = df_responses_in_time.merge(responses_notna,
left_index=True,
right_on='ISO',
how='right')
Compute the number of days that passed between the start of the outbreak and the implementation of each measure (in a specific country):
cases_and_measures_df = cases_and_measures_df.assign(
implementation_after_pandemic=(cases_and_measures_df.DATE_IMPLEMENTED -
cases_and_measures_df.initial_day).dt.days)
cases_and_measures_df = cases_and_measures_df.assign(
implementation_after_case_in_country=(
cases_and_measures_df.DATE_IMPLEMENTED -
cases_and_measures_df.first_case_date).dt.days)
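Subtracting two datetime columns yields a timedelta column, and .dt.days converts it to whole days. A short sketch with made-up dates, where the column delay_days is a hypothetical name used only here:

```python
import pandas as pd

toy = pd.DataFrame({
    "first_case_date": pd.to_datetime(["2020-03-01", "2020-03-10"]),
    "DATE_IMPLEMENTED": pd.to_datetime(["2020-03-15", "2020-03-05"]),
})

# Whole days between the start of the outbreak and the measure.
toy["delay_days"] = (toy.DATE_IMPLEMENTED - toy.first_case_date).dt.days
print(toy.delay_days.tolist())
```

A negative value means that the measure was introduced before the country passed the outbreak threshold.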
Again, remove NaNs to make plotting easier:
cases_and_measures_df = cases_and_measures_df.dropna()
Drop measures implemented on or before 22 January 2020 (when the data starts) or on or after 1 May 2020:
cases_and_measures_df = cases_and_measures_df[
(cases_and_measures_df.DATE_IMPLEMENTED > '2020-01-22')
& (cases_and_measures_df.DATE_IMPLEMENTED < '2020-05-01')]
Compute the number of cases in the country at the time that each measure was implemented:
measure_dates = pd.to_datetime(significative_cases_data.T.index)
cases_by_date = significative_cases_data.T.reindex(index=measure_dates)
cases_on_measure = []
for measure in cases_and_measures_df.iterrows():
    mdate = measure[1].DATE_IMPLEMENTED
    miso = measure[1].ISO
    # look up the number of cases in that country on the day of the measure
    cases_on_measure.append(cases_by_date.loc[mdate, miso])
cases_and_measures_df['cases_on_measure'] = cases_on_measure
Define what should be displayed on hover:
TOOLTIPS = [("Country", "@COUNTRY"),
("Cases on day of implementation", "$y{int}"),
("Days passed since first case in country", "$x"),
("Policy", "@MEASURE - @COMMENTS")]
We generate one colour per country. The colours don't carry any meaning; they just make it easier to distinguish countries.
# colors = [bokeh.colors.RGB(np.random.randint(0,255),np.random.randint(0,255),np.random.randint(0,255)) for i in range(500)]
colors = viridis(len(focus_countries))
Initialize bokeh datasource:
subset = cases_and_measures_df[cases_and_measures_df.ISO.isin(focus_countries)]
source_1 = ColumnDataSource(subset)
COUNTRIES = list(subset.COUNTRY.unique())
Initialize the figure:
measures_plot = figure(width=plot_width,
height=plot_height,
y_axis_type="log",
x_range=(0, 70),
title='Comparison of the timings of measures')
Generate plots for all types of measures:
markers = [
measures_plot.square, measures_plot.cross, measures_plot.circle,
measures_plot.x, measures_plot.triangle, measures_plot.diamond,
measures_plot.square_cross
]
categories = {}
for i, cat in enumerate(cases_and_measures_df.CATEGORY.unique()):
    view = CDSView(source=source_1,
                   filters=[GroupFilter(column_name='CATEGORY', group=cat)])
    # only the first category is visible initially
    visible = i == 0
    categories[cat] = markers[i](source=source_1,
                                 view=view,
                                 x='implementation_after_case_in_country',
                                 y='cases_on_measure',
                                 color=factor_cmap('COUNTRY', colors,
                                                   COUNTRIES),
                                 size=10,
                                 visible=visible)
Add a clickable legend to the plot:
legend_items = [(i, [categories[i]]) for i in categories.keys()]
legend = Legend(items=legend_items, location=(350, 50), click_policy="hide")
Set plot trimmings:
measures_plot.add_layout(legend, 'center')
measures_plot.xaxis.axis_label = "Days since first case in the country"
measures_plot.yaxis.axis_label = "Cases on the day the measure was introduced"
measures_plot.title.text_font_size = fontsize_titles_px
measures_plot.xaxis.axis_label_text_font_size = fontsize_labels_px
measures_plot.yaxis.axis_label_text_font_size = fontsize_labels_px
Add hover tool to the plot:
measures_plot.add_tools(
HoverTool(tooltips=TOOLTIPS, renderers=list(categories.values())))
show(measures_plot)
What is interesting in this visualisation is that many countries have introduced measures over time, rather than all at once. We can see that each time a new measure is introduced, the number of cases is higher than it was when the previous measure was introduced. This trend suggests that governments are actively monitoring the situation and are responding to the growing number of cases with stricter measures in an attempt to curb the spread of the virus. There are many different reasons why different measures are introduced. One interesting question to explore is whether some countries were quicker to go into lockdown because they lack health system capacity. The following plots explore the government measures and how the healthcare system may have contributed to the different government decisions regarding the pandemic.
To start the more analytical part of this investigation, let's look at how countries' GDP and expenditure on healthcare influence the spread of the disease. Notice the red line, which represents the linear regression of the number of cases (adjusted by population) on both of the indicators mentioned above. Hovering over a point in this visualisation will reveal the country.
We get both the GDP and Healthcare expenditure data directly from the worldbank dataset:
plot_data = worldbank_data.loc[:, [
"gdp_per_capita", "health_expenditure_per_capita", "population"
]]
Remove San Marino, a microstate whose very small population makes it an outlier that skews all of the other data:
plot_data = plot_data.drop('SMR')
Compute the total number of coronavirus cases in each country. Since the COVID-19 dataset we have is cumulative, this is simply the number of cases on the latest date in the dataset:
country_sum = significative_cases_data.iloc[:, -1]
plot_data = plot_data.assign(cases=country_sum)
Divide the number of cases by population to allow comparing the number of cases across countries:
plot_data = plot_data.assign(cases_per_capita=plot_data.cases /
plot_data.population)
Yet again, drop rows that contain NaNs to make plotting possible:
data_complete = plot_data.dropna()
Initialize the bokeh data source and the hover tooltips:
source = ColumnDataSource(data_complete)
TOOLTIPS = [
("Country", "@country_code"),
]
We will make two similar plots, one for GDP and one for Health Expenditure.
gdp_plot = figure(tooltips=TOOLTIPS,
width=int(0.55 * plot_width),
height=plot_height)
health_plot = figure(tooltips=TOOLTIPS,
width=int(0.45 * plot_width),
height=plot_height)
plots = [gdp_plot, health_plot]
indicators = ['gdp_per_capita', 'health_expenditure_per_capita']
titles = ['GDP per Capita', 'Healthcare Expenditure per Capita']
For each of the indicators, we plot all the countries and fit a regression line on the indicator.
for i, (title, plot, indicator) in enumerate(zip(titles, plots, indicators)):
    plot.circle(x=indicator,
                y='cases_per_capita',
                source=source,
                size=10,
                fill_alpha=0.6)
    # fit a simple linear regression of cases per capita on the indicator
    X = data_complete[indicator].values.reshape(-1, 1)
    y = data_complete.cases_per_capita
    reg = LinearRegression().fit(X, y)
    y_predicted = reg.predict(X)
    score = reg.score(X, y)
    plot.line(data_complete[indicator],
              y_predicted,
              color='red',
              legend_label="R^2: {}".format(score))
    plot.xaxis.axis_label = title
    if i == 0:
        plot.yaxis.axis_label = "Cases (adjusted by population)"
    plot.title.text_font_size = fontsize_titles_px
    plot.xaxis.axis_label_text_font_size = fontsize_labels_px
    plot.yaxis.axis_label_text_font_size = fontsize_labels_px
# hide the y tick labels of the second plot, which shares its scale with the first
health_plot.yaxis.major_label_text_font_size = "0pt"
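The score shown in the legend is the coefficient of determination, R^2 = 1 - SS_res / SS_tot. To make that quantity concrete, the sketch below computes it by hand on made-up points. It uses numpy's polyfit in place of scikit-learn's LinearRegression purely to keep the example small; for a straight-line fit both give the same least-squares line.

```python
import numpy as np

# Made-up data points lying close to a straight line.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.1, 5.9, 8.0])

# Least-squares straight line through the points.
slope, intercept = np.polyfit(x, y, deg=1)
y_hat = slope * x + intercept

# R^2 = 1 - (residual sum of squares) / (total sum of squares).
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(round(r_squared, 4))
```

A value close to 1 means the line explains almost all of the variation in y, while a value close to 0 means the indicator explains very little of it.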
Format the plot with a title:
titlediv = row(Div(
text=
"<h1 style='text-align: center'> Influences of GDP and Healthcare Expenditure on COVID19 cases</h1>"
),
sizing_mode='scale_width')
plots = gridplot([[gdp_plot, health_plot]])
health_plot = column([titlediv, plots])
Although not shown in the visualizations, out of curiosity, we try multiple regression to see if using both indicators is better for predicting the number of cases:
X = data_complete[indicators].values
y = data_complete.cases_per_capita.values
regression = LinearRegression().fit(X, y)
text = "R-square for multiple regression on both GDP and health expenditure: {}. This is slightly better than when just considering the GDP.".format(
regression.score(X, y))
print(text)
show(health_plot)
Surprisingly, it appears that greater GDPs and greater expenditures on healthcare are positively correlated with the number of cases in the country. Countries like the USA and Luxembourg have a greater GDP and spend more on healthcare, yet have a large number of cases relative to their population. This trend is likely to be influenced by the fact that the coronavirus spread initially through many wealthy countries in Europe, which have therefore been experiencing the pandemic for a longer period and have accumulated more cases. However, this could also be explained by government stances relating to healthcare. In countries with better healthcare systems, governments may have felt better prepared to treat patients with the virus and thus waited longer to put the country into lockdown, whereas poorer countries may have gone straight into lockdown. The next visualisation explores this trend.
We now want to look at how the capacity of a country's healthcare system impacted the speed of the government response. The question to be answered is: how does the strength of a country's healthcare system influence how quickly that country responds to the pandemic? Here, we can see how the healthcare coverage index is related to the time it took the government to implement measures that correspond to 50 or more on the Oxford Stringency Index. The Oxford Stringency Index quantifies how strict the different government measures are. An important note here is that the size of a country's "bubble" on this plot is related to the capacity of its healthcare system. Hovering over a point in this visualisation will reveal the country and numerical information.
We start by only keeping rows in the worldbank data that have the info we need:
worldbank_data_with_uhc = worldbank_data[
worldbank_data.universal_healthcare_coverage_index.notna()]
worldbank_data_with_hospitalbeds = worldbank_data_with_uhc[
worldbank_data_with_uhc.hospital_beds_per_1000.notna()]
We then merge that with the Oxford Stringency Index dataset:
oxford_with_worldbank = oxford_new_data.merge(worldbank_data,
left_on='CountryCode',
right_index=True,
how='left')
Initially, our plan was to visualise the time it took countries to go into lockdown. However, measuring this "speed of lockdown" doesn't work - it's hard to define a lockdown, and not all countries had one. Instead, we calculate the number of days until the stringency index is more than 50.
countrycodes = oxford_with_worldbank.CountryCode.unique()
days_before_high_stringency = {}
for cc in countrycodes:
    country_series = oxford_with_worldbank[oxford_with_worldbank.CountryCode ==
                                           cc]
    # position of the first day on which the stringency index exceeds 50
    for i, index in enumerate(country_series.StringencyIndex):
        if index > 50:
            days_before_high_stringency[cc] = i
            break
days_from_outbreak_to_high_stringency = {}
for cc in countrycodes:
    if (cc not in days_before_high_stringency.keys()) or (
            cc not in firstdays_dict.keys()):
        # print("missing cc: " + cc)
        continue
    else:
        days_from_outbreak_to_high_stringency[
            cc] = days_before_high_stringency[cc] - firstdays_dict[cc]
from_outbreak_to_high_stringency_df = pd.DataFrame(
    list(days_from_outbreak_to_high_stringency.values()),
    index=days_from_outbreak_to_high_stringency.keys())
dataset_for_plot = worldbank_data.merge(from_outbreak_to_high_stringency_df,
left_index=True,
right_index=True,
how="inner").rename(
{0: "from_outbreak_to_stringent"},
axis="columns")
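The core of this computation is finding the position of the first value above a threshold and subtracting the outbreak start day. A compact sketch of that idea, using a hypothetical helper days_until_above and made-up index values:

```python
def days_until_above(values, threshold=50):
    """Return the position of the first value above threshold, or None if there is none."""
    return next((i for i, v in enumerate(values) if v > threshold), None)


# Made-up daily stringency index values for one hypothetical country.
toy_stringency = [0.0, 11.1, 35.2, 52.8, 70.4]
outbreak_start = 1  # hypothetical day on which the country passed 50 cases

days_to_stringent = days_until_above(toy_stringency)
delay = days_to_stringent - outbreak_start
print(delay)
```

Returning None when the threshold is never crossed plays the same role as skipping the country in the loops above.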
As before, remove rows with NaNs:
dataset_for_plot = dataset_for_plot.dropna(
subset=['universal_healthcare_coverage_index', "hospital_beds_per_1000"])
To make it a little easier to handle, we only keep the columns we need:
dataset_small = dataset_for_plot.loc[:, [
"hospital_beds_per_1000", "universal_healthcare_coverage_index",
"from_outbreak_to_stringent", "country"
]]
Normalize the number of beds per 1,000 people so it renders nicely:
beds = dataset_small['hospital_beds_per_1000']
beds = 0.3 + (beds / (max(beds))) * 2
dataset_small['beds_normalized'] = beds
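The scaling maps the number of beds to a bubble radius between 0.3 and 2.3, so that even countries with very few beds remain visible. A quick sketch on three made-up values:

```python
# Made-up numbers of hospital beds per 1,000 people.
beds_per_1000 = [0.5, 4.0, 8.0]

largest = max(beds_per_1000)

# Same formula as above: a base radius of 0.3 plus up to 2 units proportional to the bed count.
radii = [0.3 + (b / largest) * 2 for b in beds_per_1000]
print([round(r, 3) for r in radii])
```

The country with the most beds always gets the largest radius, 2.3, regardless of the absolute numbers.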
Initialize tooltips, figure, and colors for the plot:
TOOLTIPS = [("Country", "@country"),
("Hospital beds per 1.000 people", "@hospital_beds_per_1000"),
("Days between outbreak start and high stringency",
"@from_outbreak_to_stringent"),
("Universal Healthcare coverage index",
"@universal_healthcare_coverage_index")]
# We just generate many random colors to represent different countries...
colors = [
bokeh.colors.RGB(np.random.randint(0, 255), np.random.randint(0, 255),
np.random.randint(0, 255)) for i in range(500)
]
speed_of_reaction_plot = figure(
width=plot_width,
height=plot_height,
title='Rapidity of response as a function of the healthcare capacity')
Setup Bokeh data source:
source_2 = ColumnDataSource(dataset_small)
COUNTRIES = list(dataset_small.country.unique())
Plot each country as a circle whose size represents the number of hospital beds per 1,000 people:
speed_of_reaction_plot.circle(
source=source_2,
x='universal_healthcare_coverage_index',
y='from_outbreak_to_stringent',
radius='beds_normalized',
fill_alpha=0.6,
color=factor_cmap('country', colors, COUNTRIES),
)
speed_of_reaction_plot.add_tools(HoverTool(tooltips=TOOLTIPS))
speed_of_reaction_plot.xaxis.axis_label = "Universal Healthcare Coverage Index"
speed_of_reaction_plot.yaxis.axis_label = "Days between outbreak in country and stringency index >50"
speed_of_reaction_plot.title.text_font_size = fontsize_titles_px
speed_of_reaction_plot.xaxis.axis_label_text_font_size = fontsize_labels_px
speed_of_reaction_plot.yaxis.axis_label_text_font_size = fontsize_labels_px
speed_of_reaction_plot.title.align = 'center'
show(speed_of_reaction_plot)
The idea behind this plot was to see if countries with a powerful healthcare system (and with a lot of hospital beds) could "afford" to wait longer before implementing strong measures. And, indeed, this appears to be the case: the time it took for countries to enforce stringent policies seems to increase with the healthcare coverage index, and countries with many hospital beds per capita ("big bubbles") tend to be in the higher part of the plot.
Note that, as always, there may be many confounding factors at play. For instance, the coronavirus hit Europe first, and Europe has many of the most developed healthcare systems in the world, meaning that countries with "better healthcare" may have had less time overall to plan and implement policies. Other government factors may have also come into play, which is explored in the next visualisation.
Let's look at the influence that various political and economic indicators have on a country's ability to face the pandemic. On the horizontal axis, we have different indicators that represent multiple aspects of a country's government. On the vertical axis, we have, for various ranges of each indicator, the proportion of countries in that range that have succeeded in curbing the pandemic - i.e., countries whose pandemic curve has reached a peak.
For this visualisation, we need to determine which countries' outbreaks have reached a peak. We do this using SciPy's built-in function "find_peaks", together with some custom logic to check whether a country has not peaked yet (which "find_peaks" does not detect on its own):
days_before_peak = {}
for cc in countrycodes:
    try:
        testseries = daily_cases_data.loc[cc]
    except KeyError:
        continue
    try:
        # distance=1000 so that only the highest peak is returned
        tentative_peak = find_peaks(testseries, distance=1000)[0][0]
    except IndexError:
        # If no peaks are found, the curve hasn't reached a peak at all:
        peak = len(testseries)
        has_peaked = False
    else:
        # If the curve is still rising, find_peaks returns an earlier local maximum,
        # so we check whether the latest value is higher than the found peak:
        if testseries.iloc[-1] > testseries.iloc[tentative_peak]:
            peak = len(testseries)
            has_peaked = False
        else:
            peak = tentative_peak
            has_peaked = True
    try:
        firstday = firstdays_dict[cc]
    except KeyError:
        continue
    # Subtract the days until outbreak to get the actual speed in that country:
    days_before_peak[cc] = [peak - firstday, has_peaked]
peaks_df = pd.DataFrame(days_before_peak.values(),
index=days_before_peak.keys())
peaks_df.columns = ['days_before_peak', 'has_peaked']
peaks_and_wb = peaks_df.merge(worldbank_data,
left_index=True,
right_index=True,
how='inner')
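The peak check used above can be tried in isolation on a few synthetic curves. The following is a minimal sketch, not the project's exact code: the helper name `detect_peak` and the toy series are assumptions made for illustration, and only `scipy.signal.find_peaks` comes from the notebook.

```python
import numpy as np
from scipy.signal import find_peaks


def detect_peak(daily_cases):
    """Return (peak_index, has_peaked) for a 1-D sequence of daily case counts."""
    values = np.asarray(daily_cases, dtype=float)
    # A very large `distance` makes find_peaks keep only the highest local maximum.
    peaks, _ = find_peaks(values, distance=1000)
    if len(peaks) == 0:
        # No local maximum at all: the curve is still rising.
        return len(values), False
    tentative_peak = int(peaks[0])
    if values[-1] > values[tentative_peak]:
        # The latest value exceeds the detected maximum, so the real peak
        # has not been reached yet.
        return len(values), False
    return tentative_peak, True


print(detect_peak([0, 1, 3, 7, 12, 9, 5, 2, 1]))  # curve that has peaked
print(detect_peak([0, 1, 2, 4, 8, 16]))           # still rising, no local maximum
print(detect_peak([0, 2, 5, 3, 4, 9, 20]))        # early bump, then still rising
```

The third curve is the case the custom logic exists for: "find_peaks" reports the early bump, and only the comparison with the latest value reveals that the outbreak is still growing.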
Choose some indicators that we want to investigate:
indicators = [
"political_stability", "government_effectiveness",
"voice_and_accountability", 'rule_of_law',
'self_payed_health_expenditure_percent_of_total', "freedom_score",
"corruption_control", "regulatory_quality"
]
titles = [
'Political Stability', 'Government Effectiveness',
'Voice and Accountability', 'Rule of Law',
'Percentage of Self-Paid Healthcare Expenditure', 'Freedom of Press',
'Corruption Control', 'Regulatory Quality'
]
Make a small "histogram" for each subplot. Note that here the bar height doesn't correspond to frequency, as in a normal histogram, but rather to the proportion of countries in that bin who have successfully curbed the epidemic:
figures = []
for index, indicator in enumerate(indicators):
    vrange = np.linspace(peaks_and_wb[indicator].min(),
                         peaks_and_wb[indicator].max(),
                         num=10)
    dist = vrange[2] - vrange[1]
    middlepoints = [(vrange[i] + vrange[i + 1]) / 2
                    for i in range(len(vrange) - 1)]
    averages = []
    in_bin = []
    amount_in_bin = []
    for i in range(len(vrange) - 1):
        start = vrange[i]
        stop = vrange[i + 1]
        countries = peaks_and_wb[(peaks_and_wb[indicator] >= start)
                                 & (peaks_and_wb[indicator] < stop)]
        in_bin.append(", ".join(list(countries.country)))
        if len(countries) == 0:
            averages.append(0)
            continue
        count = 0
        for c in countries.iterrows():
            if c[1].has_peaked:
                count += 1
        avg = count / len(countries)
        averages.append(avg)
    TOOLTIPS = [("Countries in this bin", "@countries_peaked"),
                (indicator, "@midpoints"),
                ("Proportion of countries that have reached a peak",
                 "@proportion_peaked")]
    bin_df = pd.DataFrame()
    bin_df['midpoints'] = middlepoints
    bin_df['proportion_peaked'] = averages
    bin_df['countries_peaked'] = in_bin
    source_bin = ColumnDataSource(bin_df)
    binary_plot = figure(tooltips=TOOLTIPS,
                         y_range=(0, 1),
                         width=250,
                         height=160,
                         title=titles[index])
    # for mp, av in zip(middlepoints, averages):
    binary_plot.vbar(source=source_bin,
                     x="midpoints",
                     top="proportion_peaked",
                     width=dist - 0.05 * dist)
    binary_plot.xaxis.axis_label_text_font_size = fontsize_labels_px
    binary_plot.yaxis.axis_label_text_font_size = fontsize_labels_px
    figures.append(binary_plot)
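The per-bin proportion computed in the loop above can also be expressed with "pd.cut" and a group-by. This is only a sketch under assumptions: the function name `proportion_peaked_by_bin` and the toy dataframe are made up for illustration, and "pd.cut" uses right-closed bins (with the minimum included in the first bin), so countries sitting exactly on a bin edge may land in a different bin than in the loop, and the maximum value is kept rather than dropped.

```python
import numpy as np
import pandas as pd


def proportion_peaked_by_bin(df, indicator, n_bins=9):
    """Share of countries with has_peaked == True in equal-width bins of `indicator`."""
    edges = np.linspace(df[indicator].min(), df[indicator].max(), num=n_bins + 1)
    # Right-closed bins; include_lowest=True keeps the minimum in the first bin.
    bins = pd.cut(df[indicator], bins=edges, include_lowest=True)
    grouped = df["has_peaked"].astype(float).groupby(bins, observed=False)
    return pd.DataFrame({
        "midpoints": (edges[:-1] + edges[1:]) / 2,
        # Empty bins give NaN; report them as 0, as the loop does.
        "proportion_peaked": grouped.mean().fillna(0).to_numpy(),
        "n_countries": grouped.size().to_numpy(),
    })


# Toy data (not from the project): five countries and one indicator.
toy = pd.DataFrame({
    "government_effectiveness": [0.0, 0.1, 0.5, 0.9, 1.0],
    "has_peaked": [True, False, True, True, False],
})
result = proportion_peaked_by_bin(toy, "government_effectiveness", n_bins=4)
print(result)
```

Keeping the number of countries per bin next to the proportion helps when reading the bars: a bar of height 1 backed by a single country says much less than one backed by twenty.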
Add a title to the subplots:
titlediv = Div(
text=
"<h1> Influence of various indicators on the ability to curb the epidemic </h1>"
)
figures = [titlediv] + figures
grid = gridplot([figures[0:3], figures[3:6], figures[6:9]])
show(grid)
Some of the trends were to be expected: for instance, the proportion of countries that managed to control the epidemic is greater among countries whose "Government Effectiveness" and "Rule of Law" scores are high. Other trends are more surprising: "Voice and Accountability" (a measure of democracy) shows a peak at each end of the spectrum. Our interpretation is that countries with strong, totalitarian governments (the peak on the left of the graph), where citizens have little choice but to obey the government's decisions, are better able to react swiftly to a crisis like this than weak democracies (the valley in the middle of the graph). Strong democracies (the peak on the right of the graph), on the other hand, mostly correspond to wealthy European countries, which are better able to face the crisis for many other reasons.
Now that we understand some of the factors that have influenced government measures, it is interesting to analyse how the stringency index has evolved and how it has affected the pandemic curve.
Having explored how the pandemic spread across the world, we now look at how government measures have changed over time. This is represented using the Oxford stringency index. The slider at the bottom of the plot selects the date shown; dragging it shows how the stringency index changed as the pandemic evolved. Hovering over a country provides further numerical information. This visualisation is based on the visualisations presented by the University of Oxford [4].
Import the dataset, extract the relevant columns and store them in a new dataframe:
stringency = pd.read_csv("../datasets/OxCGRT_latest.csv", )
stringency_df = pd.DataFrame(
stringency,
columns=['CountryName', 'CountryCode', 'Date', 'StringencyIndex'])
Some countries do not have complete data for the date range 01/01/2020 to 24/04/2020, so they are removed here:
stringency_df = stringency_df[~stringency_df.CountryCode.str.contains("CPV")]
stringency_df = stringency_df[~stringency_df.CountryCode.str.contains("LSO")]
stringency_df = stringency_df[~stringency_df.CountryCode.str.contains("MAC")]
Initially, the dataset listed the stringency index per day per country as individual rows. The dataset was reshaped so that the countries run along the row index and the dates along the column index. This allows the data to be presented as a large matrix, which is easier to plot.
stringency_df.drop(stringency_df.tail(1).index, inplace=True)
stringency_df["StringencyIndex"] = pd.to_numeric(
stringency_df["StringencyIndex"])
stringency_df2 = stringency_df
stringency_df2.set_index(['CountryCode', 'Date'])
stringency_df3 = stringency_df
dates = stringency_df3['Date'].unique()
stringency_df4 = pd.DataFrame(columns=dates)
stringency_df4['CountryCode'] = stringency_df['CountryCode'].unique()
stringency_df5 = stringency_df4.set_index(['CountryCode'])
for i in range(147):  # 147 countries, 115 daily values each (01/01/2020 to 24/04/2020)
    tmp = np.asarray(stringency_df['StringencyIndex'].iloc[i * 115:115 +
                                                            115 * i])
    stringency_df5.iloc[i, :] = tmp
The last row of the dataset consisted of NaN values. This country, however, did not have data for the entire date range, so it is okay to remove it here:
stringency_df5 = stringency_df5.iloc[:147]
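The long-to-wide reshaping described above can also be done with pandas' "pivot". The sketch below uses a toy dataframe (made up for illustration) with the same column names as the stringency dataset; it is an alternative to the fixed 115-row slicing, not the code used for the website.

```python
import pandas as pd

# Toy long-format data with the same column names as the stringency dataset.
long_df = pd.DataFrame({
    "CountryCode": ["AUS", "AUS", "AUS", "DNK", "DNK", "DNK"],
    "Date": ["1/01/2020", "2/01/2020", "3/01/2020"] * 2,
    "StringencyIndex": [0.0, 5.0, 10.0, 10.0, 20.0, 40.0],
})
# Parse the day-first date strings so that the columns sort chronologically.
long_df["Date"] = pd.to_datetime(long_df["Date"], format="%d/%m/%Y")

# One row per country, one column per date.
wide_df = long_df.pivot(index="CountryCode", columns="Date",
                        values="StringencyIndex")
print(wide_df)
```

Unlike slicing by a fixed number of rows, "pivot" does not assume that every country has the same number of days: a country with missing dates simply gets NaN in the corresponding columns.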
For consistency, and to allow comparison between datasets, there must be a common date range. We remove the data before 22/01/2020 to be consistent with the cases dataset:
stringency_df5 = stringency_df5.drop([
'1/01/2020', '2/01/2020', '3/01/2020', '4/01/2020', '5/01/2020',
'6/01/2020', '7/01/2020', '8/01/2020', '9/01/2020', '10/01/2020',
'11/01/2020', '12/01/2020', '13/01/2020', '14/01/2020', '15/01/2020',
'16/01/2020', '17/01/2020', '18/01/2020', '19/01/2020', '20/01/2020',
'21/01/2020'
],
axis=1)
Ensure that the column labels are datetime objects:
stringency_df5.columns = pd.to_datetime(stringency_df5.columns, format='%d/%m/%Y')
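Once the column labels are datetime objects, a date range can be selected by comparison instead of listing every label by hand. The following is a small sketch on a toy dataframe (hypothetical values), assuming day-first date strings as in the stringency dataset.

```python
import pandas as pd

labels = ["21/01/2020", "22/01/2020", "1/02/2020"]
toy = pd.DataFrame([[0.0, 5.0, 10.0]], index=["AUS"], columns=labels)

# An explicit format avoids '1/02/2020' being read month-first as 2 January.
toy.columns = pd.to_datetime(toy.columns, format="%d/%m/%Y")

# Keep only the columns on or after the first day of the cases dataset.
cutoff = pd.Timestamp("2020-01-22")
trimmed = toy.loc[:, toy.columns >= cutoff]
print(trimmed.columns.strftime("%Y-%m-%d").tolist())
```

Passing the format explicitly matters for this dataset: without it, pandas treats ambiguous strings such as "1/02/2020" as month-first by default.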
Set colours
cscale= [[0, 'rgb(68, 1, 84)'],
[0.33, 'rgb(49, 104, 142)'],
[0.66, 'rgb(109, 205, 89)'],
[1, 'rgb(253, 231, 37)'],]
Create the slider data. Each entry holds the stringency index of every country on a given date:
data_slider = []
for col in stringency_df5.columns:
    #if (col in deaths_data.columns):
    #    text ='Deaths so far: '+ deaths_data.loc[:, :col].sum().astype(int).astype(str)
    #else:
    #    text= ''
    data = dict(
        type='choropleth',
        colorscale=cscale,
        autocolorscale=False,
        locations=stringency_df5.index,
        z=stringency_df5.loc[:, col].astype(float),
        zmax=100,
        zmin=0,
        locationmode='ISO-3',
        #text = text,
        marker=dict(
            line=dict(
                color='rgb(255,255,255)',
                width=1)),
        colorbar=dict(
            title="Stringency index")
    )
    data_slider.append(data)
Build the list of dates in the format used for the slider labels:
dates = []
for col in stringency_df5.columns:
    dates.append(col.strftime('%b -%d'))
Create steps for slider
steps = []
for i in range(len(data_slider)):
    step = dict(method='restyle',
                args=['visible', [False] * len(data_slider)],
                label=format(dates[i]))
    step['args'][1][i] = True
    steps.append(step)
sliders = [dict(active=0, pad={"t": 1}, steps=steps)]
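The slider works by holding one choropleth trace per date and letting each step make exactly one of them visible. The sketch below isolates that logic as plain dictionaries, without importing Plotly; the helper name `build_slider_steps` and the labels are assumptions for illustration.

```python
def build_slider_steps(labels):
    """Build one 'restyle' step per label that shows trace i and hides the rest."""
    steps = []
    for i, label in enumerate(labels):
        visible = [False] * len(labels)
        visible[i] = True
        steps.append(dict(method="restyle", args=["visible", visible], label=label))
    return steps


steps = build_slider_steps(["Jan 22", "Jan 23", "Jan 24"])
print(steps[1]["args"][1])
```

Since Plotly traces are visible by default, it may also be worth setting "visible=False" on every trace except the first when the traces are created, so that the initial view matches the first slider position. Every date adds a full copy of the map data to the figure, which is likely why the resulting HTML file becomes so large.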
Create layout and figure
layout = dict(title= "Stringency Index Over Time",
geo = dict(
projection={'type':'natural earth' }),
sliders=sliders)
fig = dict(data=data_slider, layout=layout)
Write the figure to an HTML file:
pio.write_html(fig, "fig_2.html")
Unfortunately, the interactive slider plot we originally made is too heavy to include on the website. We have instead created a GIF to show the visualisation. If you would like to view the original visualisation, please see here: https://ldorigo.github.io/visualisation_project/fig_2.html
HTML('<img src="./fig_2.gif" width="750" align="center">')
An interesting observation can be made around the first weeks of March, when the stringency index increases worldwide. Comparing this with the summary statistics plot shown at the beginning of the website, we can see that this was around the time when the number of coronavirus cases worldwide began to increase very quickly. This explains the increase in the stringency index, as governments attempted to stop the spread of the virus.
Now that we have seen the evolution of the stringency index worldwide, it is interesting to see how it correlates with the number of coronavirus cases. The following visualisation shows the total number of cases (as of 24/04/2020) and the maximum stringency index for each country. When you hover over a point, you can see which country it corresponds to. A trend line, obtained by linear regression on the logarithm of the number of cases, is shown in red. This visualisation is based on the visualisations presented by the University of Oxford [4].
Make sure both datasets use datetime objects:
stringency_df5.columns = pd.to_datetime(stringency_df5.columns, format='%d/%m/%Y')
cases_data.columns = pd.to_datetime(cases_data.columns)
Remove last three columns so both data sets have the same date range
cases_data = cases_data.iloc[:, :-3]
total_cases = cases_data['2020-04-24']
max_stringency = stringency_df5.max(axis=1)
horizontal_stack = pd.concat([total_cases, max_stringency], axis=1)
horizontal_stack.columns = ['Cases', 'Stringency']
Plot the data:
# convert the Pandas Dataframe to Bokeh ColumnDataSource
source_df = horizontal_stack.dropna()
source = ColumnDataSource(source_df)
# initialise the hover boxes
hover = HoverTool(tooltips=[
('Cases', '@Cases'),
('Stringency', '@Stringency'),
('Country', '@index'),
])
# create a new figure, specify the width, height and title
p = figure(
width=600,
height=400,
tools=[hover],
title=
"Relationship Between Number of Coronavirus Cases and Government Response",
x_axis_type="log",
)
# plot the data
p.circle('Cases', 'Stringency', size=10, source=source)
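# fit a linear regression of stringency on log(cases) and add the trend line
X = source_df['Cases'].values.reshape(-1, 1)
y = source_df['Stringency'].values
reg = LinearRegression().fit(np.log(X), y)
y_predicted = reg.predict(np.log(X))
# the score must be computed on the same log-transformed feature used for fitting
score = reg.score(np.log(X), y)
p.line(x=source_df.Cases, y=y_predicted, color='red')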
p.xaxis.axis_label = "Number of COVID-19 Cases"
p.yaxis.axis_label = "Maximum Stringency Level"
Show the plot:
show(p)
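The trend line above is a straight line in log(cases), which is why it appears straight on the logarithmic x-axis. The following sketch reproduces the idea with NumPy only, on synthetic data that is not taken from the project; "np.polyfit" stands in for scikit-learn's "LinearRegression" here.

```python
import numpy as np

# Synthetic data (not from the project): stringency grows with log(cases).
cases = np.array([10, 100, 1_000, 10_000, 100_000], dtype=float)
stringency = 20 + 6 * np.log(cases)

# Fit stringency = intercept + slope * log(cases).
log_cases = np.log(cases)
slope, intercept = np.polyfit(log_cases, stringency, deg=1)
predicted = intercept + slope * log_cases

# R^2 must be computed from predictions made on the same log-transformed feature.
ss_res = np.sum((stringency - predicted) ** 2)
ss_tot = np.sum((stringency - stringency.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot
print(round(slope, 3), round(intercept, 3), round(r_squared, 3))
```

Note that the logarithm is undefined for countries with zero cases, so such rows would need to be filtered out (or offset) before fitting on real data.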
The clear trend in this visualisation is that the maximum stringency level increases with the number of coronavirus cases. This is expected, as governments introduce stricter measures in an attempt to curb the virus. The majority of countries reached a maximum stringency level greater than 80, which means that many governments have adopted strict measures, effectively shutting down their countries to try to stop the spread of coronavirus. It is now interesting to look at how these high stringency levels have impacted the pandemic curve.
This visualisation looks at how the stringency index has impacted the pandemic curve. The pandemic curve is the same as seen previously; however, this time, the bars have been coloured to represent the stringency index.
Initialise focus countries:
focus_countries = [
    "AUS", "CHN", "DNK", "FRA", "IND", "IRN", "ITA", "MEX", "SWE", "USA"]
Separate the focus countries from the data:
data_stringency_focused = stringency_df5.loc[focus_countries].T
data_stringency_focused = data_stringency_focused.reset_index()
data_stringency_focused = data_stringency_focused.iloc[:, 1:].astype('float64')
cases_data_focused = daily_cases_data.loc[focus_countries].T.iloc[:94,:]
cases_data_focused = cases_data_focused.reset_index()
Generate color scale:
viridis_100_scale = viridis(101)
bokeh_plots = []
Plot each focus country on its own subplot:
for country in focus_countries:
    y = data_stringency_focused[country].astype(int)
    colors = [viridis_100_scale[i] for i in y]
    p = figure(
        height=200,
        width=200,
        title=country,
        toolbar_location=None,
        tools="",
        x_axis_type="datetime"
    )
    p.yaxis[0].formatter.precision = 0
    p.yaxis[0].formatter.power_limit_high = 2
    p.vbar(
        x=cases_data_focused['index'],
        top=cases_data_focused[country],
        width=50000000,  # bar width in milliseconds on a datetime axis
        color=colors
    )
    bokeh_plots.append(p)
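Each bar is coloured by looking up the day's stringency index in a 101-entry palette. The sketch below shows a slightly more defensive version of that lookup, which rounds instead of truncating, clamps out-of-range values and tolerates missing data (casting a column containing NaN to int raises an error in pandas). The helper name `stringency_to_colors`, the grey fallback colour and the greyscale palette are assumptions standing in for "viridis(101)".

```python
import math


def stringency_to_colors(values, palette, missing_color="#cccccc"):
    """Map stringency values (0-100) to entries of a 101-colour palette."""
    colors = []
    for value in values:
        if value is None or math.isnan(value):
            colors.append(missing_color)  # no data for that day
            continue
        # Round to the nearest integer and clamp to the valid palette range.
        index = min(max(int(round(value)), 0), len(palette) - 1)
        colors.append(palette[index])
    return colors


# Hypothetical greyscale stand-in for viridis(101): 101 hex colours, dark to light.
palette = ["#{0:02x}{0:02x}{0:02x}".format(round(i * 255 / 100)) for i in range(101)]
bar_colors = stringency_to_colors([0, 47.6, 100, float("nan"), 120], palette)
print(bar_colors)
```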
Add title:
titlediv = Div(
text=
"<h1 style='text-align: center'> Impact of the Stringency Index on the Pandemic Curve</h1>",
sizing_mode='stretch_width'
)
Add color bar:
color_mapper = LinearColorMapper(palette="Viridis256", low=0, high=100)
color_bar = ColorBar(
color_mapper=color_mapper,
label_standoff=12,
border_line_color=None,
location=(0,100),
orientation='horizontal')
color_bar_plot = figure(title="Stringency Index", title_location="above",
height=300, width=400,
toolbar_location=None, min_border=0,
outline_line_color=None)
color_bar_plot.add_layout(color_bar, 'below')
color_bar_plot.title.align="center"
color_bar_plot.title.text_font_size = '12pt'
Generate plot layout:
firstrow = row(bokeh_plots[0:4])
secondrow = row(bokeh_plots[4:8])
thirdrow = row(bokeh_plots[8:10] + [color_bar_plot])
layout = column(titlediv,firstrow,secondrow,thirdrow)
show(layout)
As the curve rises towards a peak, the stringency index increases with it. By the time the peak was reached, most countries had implemented strict government measures. In most cases, after the peak and the introduction of strict lockdown measures (shown by the high stringency index), the number of active cases began to decrease. However, some countries implemented strict measures before a peak was in sight. These countries include India, France, Denmark and Mexico. For many of them, the number of cases is still rising. This may indicate that their governments acted pre-emptively, in anticipation of what was to come. For some countries, there is a clear build-up of the stringency index, showing that government measures slowly became stricter. For others, such as France, the stringency index jumped suddenly from zero to a high value. This suggests governments trying to delay the implementation of lockdown measures but then being overwhelmed by the number of cases, throwing the country into sudden lockdown.
From all of the visualisations, it is clear that the impact of coronavirus has been felt worldwide. When looking at the evolution of the virus, we can see that the rate of spread has increased over the last six weeks. As a result of this, the exponential growth of the virus has become clear and increasingly strict government measures have been introduced as shown by the stringency index.
The pandemic curve showed that countries worldwide are at different stages of the pandemic. The date on which the virus entered a country has also been seen to affect when government measures were introduced. As the number of cases increases, the government measures become stricter. This suggests that governments are actively monitoring the pandemic and responding to the growing number of cases with stricter measures in an attempt to curb the spread of the virus. However, there are often other reasons why different government measures were enacted. GDP and healthcare expenditure had a great influence on the measures introduced. Looking at healthcare capacity, countries with a greater healthcare capacity appear to have waited longer before implementing strict government measures. Other government factors also come into play: countries whose government effectiveness and rule of law indicators are high had greater control over the pandemic.
The stringency index, like the number of cases, has increased worldwide over the last six weeks. A linear regression on the logarithm of the number of cases showed that the stringency index increases with the number of cases, which is expected as governments respond with stricter measures. Before reaching the peak of the curve, governments had generally enforced the strictest measures in their country (although there are differences between countries). In most cases, after the peak and the introduction of strict lockdown measures, the number of active cases began to decrease, which suggests a positive impact of government measures.
[1] ACAPS. 2020. #COVID19 Government Measures Dataset. [Online]. Available: https://www.acaps.org/covid19-government-measures-dataset. [Accessed 12 April 2020].
[2] Johns Hopkins University. 2020. Coronavirus Resource Center. [Online]. Available: https://coronavirus.jhu.edu/. [Accessed 12 April 2020].
[3] The World Bank. 2019. World Bank Open Data. [Online]. Available: https://data.worldbank.org/. [Accessed 12 April 2020].
[4] University of Oxford. 2020. Coronavirus Government Response Tracker. [Online]. Available: https://www.bsg.ox.ac.uk/research/research-projects/coronavirus-government-response-tracker. [Accessed 12 April 2020].